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What is Data? 


Need of Data 
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explained best in quantitative terms. Statistical analysis of those variables has 
become a necessity today. For example, to study cropping pattern of an area, it 
is necessary to have statistical information about the cropped area, crop yield 
and production, irrigated area, amount of rainfall and inputs like use of fertiliser, 
insecticides, pesticides, etc. Similarly, data related to the total population, density, 
number of migrants, occupation of people, their salaries, industries, means of 
transportation and communication is needed to study the growth of a city. Thus, 
data plays an important role in geographical analysis. 


Presentation of the Data 


You might have heard the story of a person who was travelling with his wife and 
a five-year old child. On his way, he had to cross a river. Firstly, he fathomed the 
depth of the river at four points as 0.6, 0.8, 0.9 and 1.5 metres. He calculated the 
average depth as 0.95 metres. His child’s height was 1 metre. So, he led them to 
cross the river and his child drowned in the river. On the other bank, he sat 
pondering: “Lekha Jokha Thahe, to Bachha Dooba Kahe?” (Why did the child 
drown when average depth was within the reach of each one?). This is called 
statistical fallacy, which may deviate you from the real situation. So, it is important 
to collect the data to know the facts and figures, but equally important is the 
presentation of data. Today, the use of statistical methods in the analysis, 
presentation and in drawing conclusions plays a significant role in almost all 
disciplines, including geography, which use the data. It may, therefore, be inferred 
that the concentration of a phenomenon, e.g., population, forest or network of 
transportation or communication not only vary over space and time but may 
also be conveniently explained using the data. In other words, you may say that 
there is a shift from qualitative description to quantitative analysis in explaining 
the relationship among variables. Hence, analytical tools and techniques have 
become more important these days to make the study more logical and derive 
precise conclusion. Precise quantitative techniques are used right from the 
beginning of collecting and compiling data to its tabulation, organisation, ordering 
and analysis till the derivation of conclusions. 


Sources of Data 


The data are collected through the following ways. These are: 1. Primary Sources, 
and 2. Secondary Sources. 

The data which are collected for the first time by an individual or the group 
of individuals, institution/organisations are called Primary sources of the 
data. On the other hand, data collected from any published or unpublished 
sources are called Secondary sources. Fig. 1.1 shows the different methods 
of data collection. 


Sources of Primary Data 


1. Personal Observations 

It refers to the collection of information by an individual or group of individuals 
through direct observations in the field. Through a field survey, information about 
the relief features, drainage patterns, types of soil and natural vegetation, as well 
as, population structure, sex ratio, literacy, means of transport and 
communication, urban and rural settlements, etc., is collected. However, in 
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carrying out personal observations, the person(s) involved must have theoretical 
knowledge of the subject and scientific attitude for unbiased evaluation. 


2. Interview 

In this method, the researcher gets direct information from the respondent 
through dialogues and conversations. However, the interviewer must take the 
following precautions while conducting an interview with people of the area: N 


j 


" 
Ssy-aeq S 





(i) A precise list of items about which information is to be gathered from “© 
the persons interviewed be prepared. 

(ii) The person(s) involved in conducting the interview should be clear about j . 
the objective of the survey. ¥ 


(iii) The respondents should be taken into confidence before asking any sensitive Ss 
question and he/she be assured that the secrecy will be maintained. 


e 90INO 


(iv) A congenial atmosphere should be created so that the respondent may r 
explain the facts without any hesitation. s 


a 


"r 


x 
uonejdwog 


(v) The language of the questions should be simple and polite so that D 
the respondents feel motivated and readily agree to give fil ors 
information asked for. { } | 


(vi) Avoid asking any such question that may hurt the self-respect of the- ‘ 


religious feelings of the respondent. ; 


(vii) At the end of the interview, ask the respondent what addita ys ok 


provided by him/her. 


(viii) Pay your thanks and gratefulness for sparing his/her valuable time 


information he/she may provide, other than what has already been é Pat 
a. 
for you. 
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3. Questionnaire/Schedule 

In this method, simple questions and their possible answers are written on a 
plain paper and the respondents have to tick-mark the possible answers from 
the given choices. At times, a set of structured questions are written and sufficient 
space is provided in the questionnaire where the respondent write their opinion. 
The objectives of the survey should be clearly mentioned in the questionnaire. 
This method is useful in carrying out the survey of a larger area. Even 
questionnaire can be mailed to far-flung places. The limitation of the method is 
that only the literate and educated people can be approached to provide the 
required information. Similar to the questionnaire that contains the questions 
pertaining to the matter of investigation is the schedule. The only difference 
between the questionnaire and the schedule is that the respondent himself/ 
herself fills up the questionnaires, whereas, a properly trained enumerator himself 
fills up schedules by asking question addressed to the respondents. The main 
advantage of schedule over the questionnaire is that the information from both 
literate and illiterate respondents can be collected. 


4. Other Methods 
The data about the properties of soil and water are collected directly in the field 
by measuring their characteristics using 
soil kit and water quality kit. Similarly, 
field scientists collect data about the 
health of the crops and vegetation using 
transducers (Fig. 1.2). 


Secondary Source of Data 


Secondary sources of data consist of 
published and unpublished records 
which include government publications, 
documents and reports. 





Fig. 1.2 : Field Scientist taking 
Published Sources Measures of Crop Health 


1. Government Publications 


The publications of the various 
ministries and the departments of the 
Government of India, state governments 
and the District Bulletins are one of the 
most important sources of secondary 
information. These include the Census 
of India published by the Office of the 
Registrar General of India, reports of the 
National Sample Survey, Weather 
Reports of Indian Meteorological et Sere 
Department and Statistical Abstracts INDIA 
published by state governments, and eR 
the periodical reports published by 
different Commissions. Some of the 
government publications are shown in Fig. 1.3 : Some of the Government 
Fig. 1.3. Publications 
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2. Semi/Quasi-government Publications 


The publications and reports of Urban Development Authorities and Municipal 
Corporations of various cities and towns, Zila Parishads (District Councils), etc. 
fall under this category. 


3. International Publications 


The international publications 
comprise yearbooks, reports and 
monographs published by different 


agencies of the United Nations, such f> 
as United Nations Educational, 
Scientific and Cultural Organisation Human Developmen E | 4 
(UNESCO), United Nations eee 


Development Programme (UNDP), 
World Health Organisation (WHO), 
Food and Agricultural Organisation 
(FAO), etc. Some of the important 
publications of the United Nations 
that are periodically published are 
Demographic Year Book, Statistical 
Year Book and the Human 
Development Report (Fig. 1.4). 





Fig. 1.4 : Some of the United Nations 
4, Private Publications Publications 


The yearbooks, surveys, research reports and monographs published by 





newspapers and private organisations fall under this category. ES 
5 
5. Newspapers and Magazines PRE 
The daily newspapers and the weekly, fortnightly and monthly magazines serve i { al a ~. 
as easily accessible sources of secondary data. ke Ae 3 
2 S 
Sine: = Lon 
6. Electronic Media t Zj < w i 
UN ' s * 
The electronic media, specially the internet, has emerged as a major source of i j 1 g » Pr 
secondary data in recent times. ` 3 4° 
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1. Government Documents 


The unpublished reports, monographs and documents are yet another source 
of secondary data. These documents are prepared and maintained 





important source of village-level information. p= at aie 


2. Quasi-government Records er A p 


th s 
The periodical reports and the development plans prepared and maintained by g I, 
different Municipal Corporations, District Councils and Civil Services departments $ ee X J 
are included in Quasi-government records. j h k 
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3. Private Documents 


These include unpublished reports and records of companies, trade unions, 
different political and apolitical organisations and residents’ welfare associations. 


Tabulation and Classification of Data 


The data collected from primary or secondary sources initially appear as a big 
jumble of information with the least of comprehension. This is known as raw 
data. To draw meaningful inferences and to make them usable the raw data 
requires tabulation and classification. 

One of the simplest devices to summarise and present the data is the 
Statistical Table. It is a systematic arrangement of data in columns and rows. 
The purpose of table is to simplify the presentation and to facilitate comparisons. 
This table enables the reader to locate the desired information quickly. Thus, the 
tables make it possible for the analyst to present a huge mass of data in an 
orderly manner within a minimum of space. 


Data Compilation and Presentation 


Data are collected, tabulated and presented in a tabular form either in absolute 
terms, percentages or indices. 


Absolute Data 


When data are presented in their original form as integers, they are called absolute 
data or raw data. For example, the total population of a country or a state, the 
total production of a crop or a manufacturing industry, etc. Table 1.1 shows the 
absolute data of population of India and some of the selected states. 


Table 1.1 : Population of India and Selected States/Union Territories, 2011 


State/ India/State/ Total Population 
UT Code Union Territory 


— 


INDIA! 1,21,05,69,573 | 62,31,21,843 |58,74,47,730 
Jammu and Kashmir? 1,25,41,302 66,40,662 59,00,640 
Himachal Pradesh 68,64,602 34,81,873 33,82,729 
Punjab 2,77,43,338 1,46,39,465 1,31,03,873 
Chandigarh? 10,55,450 5,80,663 4,74,787 
Uttarakhand 1,00,86,292 D1,37,7/3 49,48,519 
Haryana 2,53,51,462 1,34,94,734 1,18,56,728 
National Capital 1,67,87,941 89,87 ,326 78,00,615 
Territory of Delhi 

Rajasthan 6,85,48,437 3,55,50,997 3,29,97,440 
Uttar Pradesh 19,98,12,341 10,44,80,510 9,53,31,831 
Bihar 10,40,99,452 5,42,78,157 4,98,21,295 





! inclusive of all territorial boundary of India 
2 excluding PoK 
3 Union Territory 


Source : Census, 201 1 
Percentage/Ratio 


Some time data are tabulated in a ratio or percentage form that are computed 
from a common parameter, such as literacy rate or growth rate of population, 
percentage of agricultural products or industrial products, etc. Table 1.2 presents 
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literacy rates of India over the decades Table 1.2 : Literacy Rate : 1951 - 2011 
in a percentage form. Literacy rate is 
calculated as: 


Person Female 


Total Literates 


Total Population ALUN 


Index Number 


An index number is a statistical 
measure designed to show changes in 
variable or a group of related variables Source: Census, 2011 

with respect to time, geographic 

location or other characteristics. It is to be noted that index numbers not only 
measure changes over a period of time but also compare economic conditions of 
different locations, industries, cities or countries. Index number is widely used 
in economics and business to see changes in price and quantity. There are various 
methods for the calculation of index number. However, the simple aggregate 
method is most commonly used. It is obtained using the following formula: 


Xq 


Ya x100 


X qı = Total of the current year production 








X do = Total of the base year production 


Generally, base year values are taken as 100 and index number is calculated 
thereupon. For example, Table 1.3 shows the production of iron ore in India and 
the changes in index number from 1970-71 to 2000-01 taking 1970-71 as the 


base year. Table 1.3 : Production of Iron Ore in India 






" 
S syed” 


Production Calculation Index Number 
(in million tonnes) 
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1990-91 
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2000-01 





Source — India: Economic Year Book, 2005 


Processing of Data 


The processing of raw data requires their tabulation and classification in seldoted ; 4 
classes. For example, the data given in Table 1.4 can be used to understand how “, 


they are processed. a N | ( 


We can see that the given data are ungrouped. Hence, the first step is to “yer 
eee ae 


4 ` we 


group data in order to reduce its volume and make it easy to understand. 
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Table 1.4 : Score of 60 Students in Geography Paper 





Grouping of Data 


The grouping of the raw data requires determining of the number of classes in 
which the raw data are to be grouped and what will be the class intervals. The 
selection of the class interval and the number of classes, however, depends upon 
the range of raw data. The raw data given in Table 1.4 ranges from 02 to 96. We 
can, therefore, conveniently choose to group the data into ten classes with an 
interval of ten units in each group, e.g. 0-10, 10-20, 20-30, etc. (Table 1.5). 


Table 1.5 : Making Tally Marks to Obtain Frequency 
Group Numerical of Raw Data Tally Marks Number of Individual 


a00 | 47.46.44.48.4241 | 74/6 


50-60 53,57,50,51,58, 10 
TTFA/// 
59,56,58,56,59 


a | emme 
63,64,68,64 

70-80 | 74.73.75:70.7471 | TALI | 6 | 
20-00 Tenna T | 3 O O 


90-100 | 96,92,90,90 


A e O E 





Process of Classification 


Once the number of groups and the class interval of each group are determined, 
the raw data are classified as shown in Table 1.5. Itis done by a method popularly 
known as Four and Cross Method or tally marks. 

First of all, one tally mark is assigned to each individual in the group in which 
it is falling. For example, the first numerical in the raw data is 47. Since, it falls in 
the group of 40-50, one tally mark is recorded in the column 3 of Table 1.5. 


Frequency Distribution 


In Table 1.5 we have classified the raw data of a quantitative variable and have 
grouped them class-wise. The number of individuals (places in the fourth column 
of Table 1.5) is known as frequency and the column represents the frequency 
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distribution. It illustrates how the Table 1.6 : Frequency Distribution 
different values of a variable are 


distributed in different classes. Group 
Frequencies are classified as Simple and 7 


Cumulative frequencies. 10-20 


Simple Frequencies a 


“ty | 
It is expressed by ‘f and represent the 


number of individuals falling in each - 
group (Table 1.6). The sum of all the EA L 


frequencies, assigned to all classes, 
represents the total number of individual 
observations in the given series. In 
statistics, it is expressed by the symbol N 
that is equal to X J . It is expressed as 
X. f =N=60 (Table 1.5 and 1.6). 
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Cumulative Frequencies 
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It is expressed by ‘Cf’ and can be obtained by adding successive simple 
frequencies in each group with the previous sum, as shown in the column 3 of 
Table 1.6. For example, the first simple frequency in Table 1.6 is 4. Next frequency f 


ias 
W 


or 
i | ing \ 
f >> 


of 5 is added to 4 which gives a total of 9 as the next cumulative frequency. oe 

Likewise, add every next number until the last cumulative frequency of 60 is Å TA a 4 

obtained. Note that it is equal to N or Y I.: \ \ S V A 
Advantage of cumulative frequency is that one can easily make out that there l i X f $ `y 

are 27 individuals scoring less than 50 or that 45 out of 60 individuals lie below M Ta 

the score of 70. m. a 
Each simple frequency is associated with its group or class. The exclusive or — pe bn 

inclusive methods are used for forming the groups or classes. ata) Pa 


Exclusive Method 
As shown in Table 1.6, two numbers are shown in its first column . Notice that 
the upper limit of one group is the same as the lower limit of the next group. For 
example, the upper limit of the one group (20 — 30) is 30, which is the lower limit 
of the next group (30 - 40), making 30 to appear in both groups. But any 
observation having the value of 30 is included in the group where it is at its lower 
limit and it is excluded from the group where it is the upper limit as (in 20-30 
groups). That is why the method is known as exclusive method, i.e. a group is fos 
excluded of its upper limits. You may now make out where all the marginal 
values of Table 1.4 will go. 

The groups in Table 1.6, are interpreted in the following manner -— 
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O and under 10 10 and under 20 

20 and under 30 30 and under 40 

40 and under 50 50 and under 60 

60 and under 70 70 and under 80 e- 

80 and under 90 90 and under100 n A go 
Hence, in this type of grouping the class extends over ten units. For example, t., a we" 


20, 21, 22, 23, 24, 25, 26, 27, 28 and 29 are included in the third group. — EN ns” E, Y 
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Inclusive Method Table 1.7 : Frequency Distribution 


In this method, a value equal to the 
upper limit of a group is included in 
the same group. Therefore, it is known 
as inclusive method. Classes are 
mentioned in a different form in this 
method, as shown in the first column 
of Table 1.7. Normally, the upper limit 
of a group differs by 1 with the lower 
limits of the next group. It is important 
to note that each group spreads over 
ten units in this method also. For 
example, the group of 50-59 includes 
the ten values i.e. 50, 51, 52, 53, 54, 
55, 56, 57, 58 and 59 (Table 1.7). In 
this method, both the upper and 
lower limit are included to find the frequency distribution. 





Frequency Polygon 


A graph of frequency 
distribution is known 
as the frequency Eq Par 
polygon. It helps in IN —*— Ogive 
comparing two or more | 
than two frequency 
distributions (Fig.1.5). 
The two frequencies are 
shown using a bar 
diagram and a line 
graph respectively. 


No. of Students 





Ogive 

When the frequencies 
are added they are Wlarks Obtained 
called cumulative 
frequencies and are 
listed in a table called 
cumulative frequency table. The curve obtained by plotting cumulative 
frequencies is called an Ogive (pronounced as ojive). It is constructed either by 
the less than method or the more than method. 

In the less than method, we start with the upper limit of the classes and go 
on adding the frequencies. When these frequencies are plotted, we get a rising 
curve as shown in Table 1.8 and Fig. 1.6. 

In the more than method, we start with the lower limits of the classes 
and from the cumulative frequency, we substract frequency of each class. 
When these frequencies are plotted, we get a declining curve as shown in 
Table 1.9 and Fig 1.7. 

Both the Figs. 1.5 and 1.6 may be combined to get a comparative picture of 
less than and more than Ogive as shown in Table 1.10 and Fig. 1.7. 





Fig. 1.5 : Frequency Distribution Polygon 
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Table 1.8 : Frequency Distribution 
less than Method 


Less than Method 


Less than 10 
Less than 20 
Less than 30 
Less than 40 








Less than 50 
Less than 60 
Less than 70 
Less than 80 
Less than 90 
Less than 100 
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Fig. 1.6 : Less than Ogive 





Table 1.9 : Frequency 
Distribution more than Method 


More than Method Cf 


More than O 
More than 10 
More than 20 
More than 30 
More than 40 
More than 50 
More than 60 
More than 70 
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EXcercises 


1. Choose the right answer from the four alternatives given below: 


(i) A number or character which represents measurement is called 


(a) Digit (b) Data (c) Number (d) Character 
(ii) A single datum is a single measurement from the 

(a) Table (b) Frequency (c) Real world (d) Information 
(iii) In a tally mark grouping by four and crossing fifth is called 

(a) Four and Cross Method (b) Tally Marking Method 

(c) Frequency plotting Method (d) Inclusive Method 


(iv) An Ogive is a method in which 
(a) Simple frequency is measured 
(b) Cumulative frequency is measured 
(c) Simple frequency is plotted 
(d) Cumulative frequency is plotted 
(v) If both ends of a group are taken in frequency grouping, it is called 
(a) Exclusive Method (b) Inclusive Method 


(c) Marking Method (d) Statistical Method 
2. Answer the following questions in about 30 words: 
(i) Differentiate between data and information. 
(ii) What do you mean by data processing? 
(iii) What is the advantage of foot note in a table? 


(iv) What do you mean by primary sources of data? 


q Sew eZ WY 


(v) Enumerate five sources of secondary data. 


3. Answer the following questions in about 125 words: 


(i) Discuss the national and international agencies where from secondary data 
may be collected. 


(ii) What is the importance of an index number? Taking an example examine 
the process of calculating an index number and show the changes. 


Activity 


1. In a class of 35 students of Geography, following marks were obtained out of 
10 marks in unit test - 1, O, 2, 3, 4, 5, 6, 7, 2, 3, 4, O, 2, 5, 8, 4, 5, 3, 6, 3, 2, 7, 6, 
5, 4, 3, 7, 8,9, 7,9, 4, 5, 4, 3. Represent the data in the form of a group frequency 
distribution. 


2. Collect the last test result of Geography of your class and represent the marks 
in the form of a group frequency distribution. 
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